Speech recognizer for voice control of mobile telephone

Authors

  • Mats Blomberg
  • Kjell Elenius
  • B. Lundström
  • Lennart Neovius
Abstract

Infovox is marketing a speaker-dependent, pattern-matching word recognition system, developed at KTH. The algorithms in the system have been modified for noise immunity, and performance has been evaluated in moving cars. The main problems were word detection and noise compensation. After simulations we decided to use a close-talking microphone and a "noise addition" method, where we added the measured noise in the moving car to the reference patterns recorded in a parked car. Using this method, the recognition rate was improved from 69% to 97% on a ten-word vocabulary using the best microphone. A more extensive test was performed on the modified recognition system using two cars and twelve speakers, seven male and five female. Most of them were naive speakers. The twenty-word vocabulary contained some confusable words and was trained in a parked car. During 98 sessions, 1,960 words were read under different conditions with an average recognition rate of 86%. With closed windows at 90 km/h the mean was 91%. An open window at the same speed decreased the result to 82%.
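The "noise addition" step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the reference patterns are represented, so this example assumes (hypothetically) that each template is a sequence of frames of per-band levels in dB and that the car noise is summarized as one average level per band. The function names are illustrative and are not taken from the Infovox/KTH system.

```python
import math


def db_to_power(level_db):
    """Convert a level in dB to linear power."""
    return 10.0 ** (level_db / 10.0)


def power_to_db(power):
    """Convert linear power back to a level in dB."""
    return 10.0 * math.log10(power)


def add_noise_to_template(template_db, noise_db):
    """Add a per-band noise estimate to every frame of a clean template.

    template_db: list of frames, each a list of per-band levels in dB
                 (assumed representation of a pattern trained in a parked car).
    noise_db:    list of per-band noise levels in dB
                 (assumed summary of noise measured in the moving car).
    Powers are added in the linear domain, then converted back to dB.
    """
    noisy = []
    for frame in template_db:
        if len(frame) != len(noise_db):
            raise ValueError("frame and noise must have the same number of bands")
        noisy.append([
            power_to_db(db_to_power(s) + db_to_power(n))
            for s, n in zip(frame, noise_db)
        ])
    return noisy


# Hypothetical two-band template with two frames, and a two-band noise estimate.
clean = [[60.0, 60.0], [40.0, 70.0]]
noise = [60.0, 20.0]
noisy = add_noise_to_template(clean, noise)
```

Under these assumptions, bands where the speech is well above the noise are left almost unchanged, while bands dominated by noise are raised to roughly the noise level, so the stored templates resemble what the recognizer would observe in the moving car.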

Similar resources

Speech Spotter: On-demand Speech Recognition in Human-Human Conversation on the Telephone or in Face-to-Face Situations / Masataka Goto

This paper describes a novel speech-interface function, called “speech spotter”, which enables a user to enter voice commands into a speech recognizer in the midst of natural human-human conversation. In the past, it has been difficult to use automatic speech recognition in human-human conversation since it was not easy to judge, from only microphone input, whether a user was speaking to anothe...

Real-time telephone transmission simulation for speech recognizer and dialogue system evaluation and improvement

Recognizer performance in telephone-based spoken dialogue systems may be strongly affected by the transmission channel. In order to investigate the impact of different parts of the transmission channel in more detail, a simulation model is presented. It implements all transmission characteristics of modern telephone networks, based on instrumentally measurable values as they are used by network...

Application of isolated word recognition to a voice controlled repertory dialer system

In this paper we describe a speaker-trained, voice-controlled repertory dialer system. The main elements of the system include: 1. A real-time speech analyzer that detects the presence of speech on the input line, and analyzes the speech to give features appropriate for a word recognizer. 2. An isolated word recognizer that decides which of a set of words was spoken. 3. A voice response syste...

User Interface Design for Voice Control Systems

A voice control system converts spoken commands into control actions, a process which is always imperfect due to errors of the speech recognizer. Most speech recognition research is focused on decreasing the recognizers’ error rates; comparatively little effort was spent to find interface designs that optimize the overall system, given a fixed speech recognizer performance. In order to evaluate...

Voice Activity Detection Using Speech Recognizer Feedback

This paper demonstrates how feedback from a speech recognizer can be leveraged to improve Voice Activity Detection (VAD) for online speech recognition. First, reliably transcribed segments of audio are fed back by the recognizer as supervision for VAD model adaptation. This allows the much stronger LVCSR acoustic models to be harnessed without adding computation. Second, when to make a VAD deci...

Publication date: 1987